Problem fix [v2] #13

Open · wants to merge 6 commits into base: dreamer-torch1.8.2

Conversation

sumwailiu commented Feb 15, 2025

There were some mistakes in the first version of the problem fix (#12), even though it did yield better final results.

Something to be corrected:

  • The original computation of reward_loss is actually correct and should not have been modified (a sketch of that computation is given after this list).

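For reference, here is a minimal sketch of how the reward loss is typically computed in Dreamer-style PyTorch code: the reward head predicts the mean of a unit-variance Normal, and the loss is the mean negative log-likelihood of the observed rewards. The names and shapes below are illustrative assumptions, not this repository's exact code.

```python
import torch
from torch.distributions import Normal

def reward_loss(predicted_reward_mean: torch.Tensor,
                true_reward: torch.Tensor) -> torch.Tensor:
    # predicted_reward_mean, true_reward: (horizon, batch) tensors.
    # The reward model parameterizes a unit-variance Normal; the loss is the
    # average negative log-probability of the observed rewards.
    dist = Normal(predicted_reward_mean, 1.0)
    return -dist.log_prob(true_reward).mean()
```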
Main changes:

The factors contributing to issue #4:

  • @letusfly85 and @coderlemon17 found that their training results were not as good as the test results of the original paper (around 700 at 1M steps). This is expected because exploration noise is added to the actions during training, while the noise is removed at test time (see the action-selection sketch after this list). Here are the training and testing results of the original dreamer-pytorch (i.e., the version before applying my fixes) in the walker-run env, where epoch 1000 corresponds to 1M steps.
    [screenshot: ori_train – training returns]
    [screenshot: ori_test – test returns]

  • The planning horizon is another factor. Here is the result of dreamer-pytorch + Fix 2, where the test return is around 700 at epoch 1000 (1M steps).

    [screenshot: ori_15 – test returns with Fix 2]

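To make the train/test gap above concrete, here is a minimal sketch of the usual Dreamer action-selection split: additive Gaussian exploration noise during training, deterministic actions at evaluation. The function and parameter names are illustrative assumptions rather than this repository's exact code.

```python
import torch

def select_action(action_mean: torch.Tensor,
                  training: bool,
                  expl_noise: float = 0.3) -> torch.Tensor:
    # During training, Gaussian exploration noise is added to the action,
    # which lowers the reported training return; at test time the noise is
    # dropped, so the test returns (the numbers quoted in the paper) are higher.
    if training:
        action_mean = action_mean + expl_noise * torch.randn_like(action_mean)
    return action_mean.clamp(-1.0, 1.0)
```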
The testing result of dreamer-pytorch + Fix 1&2 is almost the same as that of dreamer-pytorch + Fix 2:
[screenshot: fix_test – test returns with Fix 1&2]

However, their value losses differ: the value loss of dreamer-pytorch + Fix 1&2 is lower than that of dreamer-pytorch + Fix 2 (a sketch of the value loss is given after this list):

  • dreamer-pytorch + Fix 2:
    [screenshot: ori_val – value loss with Fix 2]
  • dreamer-pytorch + Fix 1&2:
    [screenshot: fix_val – value loss with Fix 1&2]

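For context on what the "value loss" measures here, a minimal sketch of the standard Dreamer value objective: the critic is regressed onto λ-returns computed over the imagined trajectory, with gradients stopped through the targets. Shapes and names are assumptions for illustration, not this PR's code.

```python
import torch

def lambda_returns(rewards: torch.Tensor, values: torch.Tensor,
                   discount: float = 0.99, lam: float = 0.95) -> torch.Tensor:
    # rewards: (horizon, batch); values: (horizon + 1, batch), where the last
    # entry bootstraps the return beyond the imagined horizon.
    next_return = values[-1]
    returns = []
    for t in reversed(range(rewards.shape[0])):
        next_return = rewards[t] + discount * (
            (1 - lam) * values[t + 1] + lam * next_return)
        returns.append(next_return)
    return torch.stack(returns[::-1])

def value_loss(values: torch.Tensor, rewards: torch.Tensor) -> torch.Tensor:
    # Mean squared error between the value predictions and detached
    # lambda-return targets, as in the Dreamer paper.
    targets = lambda_returns(rewards, values).detach()
    return 0.5 * (values[:-1] - targets).pow(2).mean()
```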